Using Repeated Patterns across Comparable Articles for Paraphrase Acquisition

Abstract

We focus on paraphrases for information extraction: expressions which should produce the same extraction output. These expressions are acquired automatically from comparable news articles (articles from the same day, on the same topic). Candidate paraphrases are paths in predicate-argument structure starting from matching anchors (typically, names) in the two sentences. By using such syntactically regularized structures and limiting ourselves to single paths, we increased the likelihood of observing repeated patterns. We measured the frequency of such candidate patterns over a large corpus, and confirmed a correlation between frequency and their accuracy as paraphrases.
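
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy predicate-argument representation (a dict mapping each node to its relation and head), the function names, the relation labels, and the sample data are all assumptions made for illustration, and the names acting as anchors are assumed to have been identified in an earlier step. The sketch pairs the paths that start from an anchor shared by two sentences and counts how often each pair of paths recurs.

```python
from collections import Counter


def path_from_anchor(structure, anchor):
    """Walk from an anchor up to the root predicate, one step per argument link.

    `structure` is a toy predicate-argument structure (an assumption of this
    sketch): it maps each node to a (relation, head) tuple.
    """
    steps = []
    node = anchor
    visited = set()  # guards against cycles in malformed input
    while node in structure and node not in visited:
        visited.add(node)
        relation, head = structure[node]
        steps.append(f"{relation}:{head}")
        node = head
    return tuple(steps)


def candidate_pairs(struct_a, struct_b, names_a, names_b):
    """Pair the paths that start from anchors present in both sentences."""
    pairs = []
    for anchor in sorted(names_a & names_b):
        path_a = path_from_anchor(struct_a, anchor)
        path_b = path_from_anchor(struct_b, anchor)
        if path_a and path_b and path_a != path_b:
            # Sort so that (p, q) and (q, p) count as the same candidate.
            pairs.append(tuple(sorted((path_a, path_b))))
    return pairs


def count_candidates(sentence_pairs):
    """Count how often each candidate pair of paths recurs across sentence pairs."""
    counts = Counter()
    for struct_a, names_a, struct_b, names_b in sentence_pairs:
        counts.update(candidate_pairs(struct_a, struct_b, names_a, names_b))
    return counts


# Invented sentence pairs standing in for comparable articles (hypothetical data).
corpus = [
    ({"Acme": ("SBJ", "acquire"), "Widgets": ("OBJ", "acquire")}, {"Acme", "Widgets"},
     {"Acme": ("SBJ", "buy"), "Widgets": ("OBJ", "buy")}, {"Acme", "Widgets"}),
    ({"Globex": ("SBJ", "acquire"), "Initech": ("OBJ", "acquire")}, {"Globex", "Initech"},
     {"Globex": ("SBJ", "buy"), "Initech": ("OBJ", "buy")}, {"Globex", "Initech"}),
]

for pair, freq in count_candidates(corpus).most_common():
    print(freq, pair)
```

Because the anchors themselves are left out of the paths, the same pair of paths can be observed with different names, which is what lets a frequency count accumulate across unrelated article pairs.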

Related resources

Paraphrase Acquisition for Information Extraction

We are trying to find paraphrases from Japanese news articles which can be used for Information Extraction. We focused on the fact that a single event can be reported in more than one article in different ways. However, certain kinds of noun phrases such as names, dates and numbers behave as “anchors” which are unlikely to change across articles. Our key idea is to identify these anchors among ...

Large Scale Acquisition of Paraphrases for Learning Surface Patterns

Paraphrases have proved to be useful in many applications, including Machine Translation, Question Answering, Summarization, and Information Retrieval. Paraphrase acquisition methods that use a single monolingual corpus often produce only syntactic paraphrases. We present a method for obtaining surface paraphrases, using a 150GB (25 billion words) monolingual corpus. Our method achieves an accu...

Automatic Paraphrase Acquisition from News Articles

Paraphrases play an important role in the variety and complexity of natural language documents. However, they add to the difficulty of natural language processing. Here we describe a procedure for obtaining paraphrases from news articles. A set of paraphrases can be useful for various kinds of applications. Articles derived from different newspapers can contain paraphrases if they report the sam...

Interrogative Reformulation Patterns and Acquisition of Question Paraphrases

We describe a set of paraphrase patterns for questions which we derived from a corpus of questions, and report the result of using them in the automatic recognition of question paraphrases. The aim of our paraphrase patterns is to factor out different syntactic variations of interrogative words, since the interrogative part of a question adds a syntactic superstructure on the sentence part (i.e...

Using Discourse Information for Paraphrase Extraction

Previous work on paraphrase extraction using parallel or comparable corpora has generally not considered the documents’ discourse structure as a useful information source. We propose a novel method for collecting paraphrases relying on the sequential event order in the discourse, using multiple sequence alignment with a semantic similarity measure. We show that adding discourse information boos...

Publication date: 2005